rustc_metadata: dedupe strings to prevent multiple copies in rmeta/query cache blow file size #98851
Conversation
The ICE happens because decoding a … We could build the …
But I can't write the symbol table before encoding (rust/compiler/rustc_metadata/src/rmeta/encoder.rs, lines 2206 to 2210 in 0e21a27), because some symbols may be missed at that point, and I can't reorder the table writes, because they are immediately written into the metadata blob.
You could put the symbol table after the CrateRoot, and save the in-file position somewhere, like we do for CrateRoot.
Force-pushed from 116ceb7 to 55032c0 (compare).
I just had a (hopefully) simpler idea: use an FxHashMap<Symbol, LazyValue> for encoding. We would encode each string only once, and emit the same LazyValue object for all other occurrences. This does not require decoding a large table upfront.
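The idea above can be sketched outside the compiler. This is a minimal stand-in, not rustc's real API: the `Encoder` struct and `lazy_string` method are hypothetical names. On the first encounter a string is written inline and its offset cached; later occurrences just return the cached offset, so each string's bytes land in the blob only once.

```rust
use std::collections::HashMap;

/// Sketch of encode-side string deduplication (hypothetical API,
/// not rustc's real EncodeContext).
struct Encoder {
    buf: Vec<u8>,
    /// Maps an already-encoded string to the offset of its first encoding.
    interned: HashMap<String, usize>,
}

impl Encoder {
    fn new() -> Self {
        Encoder { buf: Vec::new(), interned: HashMap::new() }
    }

    /// Returns the offset at which `s` is stored, writing its bytes only once.
    fn lazy_string(&mut self, s: &str) -> usize {
        if let Some(&pos) = self.interned.get(s) {
            return pos; // duplicate: reuse the first copy's offset
        }
        let pos = self.buf.len();
        // Length-prefixed string record.
        self.buf.extend_from_slice(&(s.len() as u32).to_le_bytes());
        self.buf.extend_from_slice(s.as_bytes());
        self.interned.insert(s.to_string(), pos);
        pos
    }
}

fn main() {
    let mut e = Encoder::new();
    let a = e.lazy_string("core");
    let b = e.lazy_string("alloc");
    let c = e.lazy_string("core"); // duplicate: no new bytes written
    assert_eq!(a, c);
    assert_ne!(a, b);
    // Only two string records are actually stored: (4 + 4) + (4 + 5) bytes.
    assert_eq!(e.buf.len(), 17);
}
```

The returned offset plays the role of the shared LazyValue: all users of the same string hold the same position into the blob.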
LazyValue<?> of which type?
String?
impl<'a, 'tcx> Encodable<EncodeContext<'a, 'tcx>> for Symbol {
    fn encode(&self, s: &mut EncodeContext<'a, 'tcx>) {
        if !s.symbol_table.contains_key(self) {
            let x = s.lazy(self.to_string());
Ok, this looks simpler, but fails on the first write here:
thread 'rustc' panicked at 'assertion failed: `(left == right)`
  left: `NodeStart(30)`,
 right: `NoNode`', compiler\rustc_metadata\src\rmeta\encoder.rs:397:9
encode_crate_root calls encode_crate_deps, which calls self.lazy_array (rust/compiler/rustc_metadata/src/rmeta/encoder.rs, line 1762 in 53792b9):

self.lazy_array(deps.iter().map(|&(_, ref dep)| dep))

That changes the state to LazyState::NodeStart (rust/compiler/rustc_metadata/src/rmeta/encoder.rs, lines 394 to 406 in 53792b9):
fn lazy_array<T: ParameterizedOverTcx, I: IntoIterator<Item = B>, B: Borrow<T::Value<'tcx>>>(
    &mut self,
    values: I,
) -> LazyArray<T>
where
    T::Value<'tcx>: Encodable<EncodeContext<'a, 'tcx>>,
{
    let pos = NonZeroUsize::new(self.position()).unwrap();
    assert_eq!(self.lazy_state, LazyState::NoNode);
    self.lazy_state = LazyState::NodeStart(pos);
    let len = values.into_iter().map(|value| value.borrow().encode(self)).count();
    self.lazy_state = LazyState::NoNode;
There it encounters a Symbol and tries to encode it, but can't, because of the wrong state.
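The failing assertion can be reproduced with a toy version of the encoder's state machine. This is a simplified sketch, with made-up names: the real rustc code panics via `assert_eq!`, while this sketch returns an error so the failure can be demonstrated. Starting a lazy node while another is open trips the `NoNode` check, which is exactly what happens when `Symbol::encode` tries to start its own lazy value inside `lazy_array`.

```rust
/// Simplified stand-in for rustc's LazyState.
#[derive(Debug, PartialEq, Clone, Copy)]
enum LazyState {
    NoNode,
    NodeStart(usize),
}

struct Ctx {
    pos: usize,
    lazy_state: LazyState,
}

impl Ctx {
    /// Returns Err instead of panicking, so the sketch can show the failure.
    fn start_lazy(&mut self) -> Result<(), &'static str> {
        if self.lazy_state != LazyState::NoNode {
            return Err("assertion failed: lazy_state must be NoNode");
        }
        self.lazy_state = LazyState::NodeStart(self.pos);
        Ok(())
    }

    fn finish_lazy(&mut self) {
        self.lazy_state = LazyState::NoNode;
    }
}

fn main() {
    let mut ctx = Ctx { pos: 30, lazy_state: LazyState::NoNode };
    assert!(ctx.start_lazy().is_ok()); // lazy_array opens a node
    // A Symbol encoded inside the array cannot open a nested lazy node:
    assert!(ctx.start_lazy().is_err());
    ctx.finish_lazy();
    assert!(ctx.start_lazy().is_ok()); // fine again once the array is closed
}
```

The fix therefore cannot naively call `lazy()` from inside `Symbol::encode`; the nested write has to be avoided or deferred.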
Kinda works, size reduced. Perf run please?

Looks like a slowdown; those random seeks inside the file are bad.
@bors try @rust-timer queue
Awaiting bors try build completion. @rustbot label: +S-waiting-on-perf
⌛ Trying commit 1ffb3867f739acfafb0d1c6f61aeab0a8f7bf9ca with merge 7d06dad2f206c313c621942c033c8200d8a664b4...
☀️ Try build successful - checks-actions
Queued 7d06dad2f206c313c621942c033c8200d8a664b4 with parent efb171e, future comparison URL.
Finished benchmarking commit (7d06dad2f206c313c621942c033c8200d8a664b4): comparison url.

Results: Instruction count, Max RSS (memory usage), Cycles (result tables not captured).

If you disagree with this performance assessment, please file an issue in rust-lang/rustc-perf. Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so, since this PR may lead to changes in compiler perf. Next Steps: If you can justify the regressions found in this try perf run, please indicate this with @bors rollup=never
Where to store that table?
You can put it in CrateMetadata, which already contains many side tables to accelerate decoding.
And that was implemented in 55032c05a408fdbf61a8113f1a69cef25a86a799, where DecodeContext.cdata wasn't ready at that point, but I'll try again.
The hash map would only be there to accelerate decoding, but is not necessary for correctness. We could just switch to the slower path of re-decoding + re-interning if …
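A decode-side cache along those lines might look like this (a sketch with made-up names; a byte-slice decoder stands in for DecodeContext): the map memoizes strings by file position, and a cache miss simply falls back to re-decoding from the blob, so the map is purely an accelerator and never needed for correctness.

```rust
use std::collections::HashMap;

/// Sketch: decode-side memoization keyed by file offset. The cache is an
/// optimization only; correctness comes from the fallback re-decode.
struct Decoder<'a> {
    data: &'a [u8],
    cache: HashMap<usize, String>,
}

impl<'a> Decoder<'a> {
    fn decode_str_at(&mut self, pos: usize) -> String {
        if let Some(s) = self.cache.get(&pos) {
            return s.clone(); // fast path: already decoded once
        }
        // Slow path: re-decode the length-prefixed string from the blob,
        // then "re-intern" it into the cache.
        let len = u32::from_le_bytes(self.data[pos..pos + 4].try_into().unwrap()) as usize;
        let s = String::from_utf8(self.data[pos + 4..pos + 4 + len].to_vec()).unwrap();
        self.cache.insert(pos, s.clone());
        s
    }
}

fn main() {
    // Blob laid out as: [len=4]["core"][len=5]["alloc"]
    let mut data = Vec::new();
    data.extend_from_slice(&4u32.to_le_bytes());
    data.extend_from_slice(b"core");
    data.extend_from_slice(&5u32.to_le_bytes());
    data.extend_from_slice(b"alloc");

    let mut d = Decoder { data: &data, cache: HashMap::new() };
    assert_eq!(d.decode_str_at(0), "core");
    assert_eq!(d.decode_str_at(8), "alloc");
    assert_eq!(d.decode_str_at(0), "core"); // served from the cache this time
}
```

Because the slow path always works, the cache can live anywhere convenient (e.g. in CrateMetadata) without affecting the encoding format.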
Awaiting bors try build completion. @rustbot label: +S-waiting-on-perf
⌛ Trying commit 7325ba186339602fd7f818f8021262dc7f178e28 with merge 7b8bc869db839305776669feafbbcb22bafffa7f...
☀️ Try build successful - checks-actions
Queued 7b8bc869db839305776669feafbbcb22bafffa7f with parent a39bdb1, future comparison URL.
Finished benchmarking commit (7b8bc869db839305776669feafbbcb22bafffa7f): comparison url.

Results: Instruction count, Max RSS (memory usage), Cycles (result tables not captured).

If you disagree with this performance assessment, please file an issue in rust-lang/rustc-perf. Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so, since this PR may lead to changes in compiler perf. Next Steps: If you can justify the regressions found in this try perf run, please indicate this with @bors rollup=never
Uhh, pretty bad; need to drop the last commit. I remember that when I tried a different approach of encoding/decoding strings as a table, there was a similar(?) slowdown, as DecodeContext was created a lot of times. Will try to investigate it later.
Let's merge it as is; I dropped the last commit that caused the regression in #98851 (comment), so it should be like #98851 (comment).
@bors r+ rollup=never
☀️ Test successful - checks-actions
Finished benchmarking commit (71ecf5d): comparison url.

Results: Instruction count, Max RSS (memory usage), Cycles (result tables not captured).

If you disagree with this performance assessment, please file an issue in rust-lang/rustc-perf. Next Steps: If you can justify the regressions found in this perf run, please indicate this with @rustbot label: +perf-regression
This is a small but consistent slowdown across multiple benchmarks. It also reduces max-rss on a number of benchmarks, mostly for …
@klensy: what are the file size improvements?
https://perf.rust-lang.org/compare.html?start=0b79f758c9aa6646606662a6d623a0752286cd17&end=71ecf5d359bf750cc171e124779a46985633439d&stat=size%3Acrate_metadata
We can take a look :-) Assuming this is the correct metric.
The instruction-count regression is mostly limited to doc builds. Improvements on file sizes: this also counts libs with rustc metadata embedded (rlibs?), like stdlib/test (can check that on the next nightly).
Size difference between 0b79f75...e1b28cd of …
@rustbot label: perf-regression-triaged |
…bols, r=bjorn3 — Symbols: do not write string values of preinterned symbols into compiled artifacts. r? `@bjorn3` Followup for rust-lang#98851, rust-lang#98851 (comment)
r? @cjgillot
Encodes strings in the rmeta/query cache so that duplicated ones are encoded as offsets to their first occurrence, reducing file size.
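The scheme in the description can be sketched as a tagged on-disk format (the tag bytes and function names here are made up for illustration, not the real rmeta layout): the first occurrence of a string is written inline, each duplicate is written as a back-reference holding the offset of that first record, and the decoder follows the reference.

```rust
use std::collections::HashMap;

const INLINE: u8 = 0;  // tag: length-prefixed string bytes follow
const BACKREF: u8 = 1; // tag: u32 offset of an earlier INLINE record follows

fn encode(strings: &[&str]) -> Vec<u8> {
    let mut buf = Vec::new();
    let mut seen: HashMap<&str, u32> = HashMap::new();
    for &s in strings {
        if let Some(&pos) = seen.get(s) {
            // Duplicate: 5 bytes (tag + offset) instead of the full string.
            buf.push(BACKREF);
            buf.extend_from_slice(&pos.to_le_bytes());
        } else {
            seen.insert(s, buf.len() as u32);
            buf.push(INLINE);
            buf.extend_from_slice(&(s.len() as u32).to_le_bytes());
            buf.extend_from_slice(s.as_bytes());
        }
    }
    buf
}

/// Decodes one record at `pos`; returns the string and the next position.
fn decode_at(buf: &[u8], pos: usize) -> (String, usize) {
    match buf[pos] {
        INLINE => {
            let len = u32::from_le_bytes(buf[pos + 1..pos + 5].try_into().unwrap()) as usize;
            let s = String::from_utf8(buf[pos + 5..pos + 5 + len].to_vec()).unwrap();
            (s, pos + 5 + len)
        }
        BACKREF => {
            let target = u32::from_le_bytes(buf[pos + 1..pos + 5].try_into().unwrap()) as usize;
            (decode_at(buf, target).0, pos + 5)
        }
        _ => unreachable!("unknown tag"),
    }
}

fn main() {
    let buf = encode(&["std", "core", "std", "std"]);
    let mut pos = 0;
    let mut out = Vec::new();
    for _ in 0..4 {
        let (s, next) = decode_at(&buf, pos);
        out.push(s);
        pos = next;
    }
    assert_eq!(out, ["std", "core", "std", "std"]);
    // The two duplicate "std" records cost 5 bytes each instead of 8.
    assert_eq!(buf.len(), (1 + 4 + 3) + (1 + 4 + 4) + 2 * 5);
}
```

This is the same trade-off the perf runs above measured: smaller files at the cost of extra seeks when decoding back-references.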